cross-domain mapping
Guided Manifold Alignment with Geometry-Regularized Twin Autoencoders
Rhodes, Jake S., Rustad, Adam G., Nielsen, Marshall S., McClellan, Morgan Chase, Gardner, Dallan, Hedges, Dawson
Abstract: Manifold alignment (MA) comprises a set of techniques for learning shared representations across domains, yet many traditional MA methods are incapable of performing out-of-sample extension, limiting their real-world applicability. We propose a guided representation learning framework leveraging a geometry-regularized twin autoencoder (AE) architecture to enhance MA while enabling generalization to unseen data. Our method enforces structured cross-modal mappings to maintain geometric fidelity in learned embeddings. By incorporating a pre-trained alignment model and a multitask learning formulation, we improve cross-domain generalization and representation robustness while maintaining alignment fidelity. We evaluate our approach using several MA methods, showing improvements in embedding consistency, information preservation, and cross-domain transfer. Additionally, we apply our framework to Alzheimer's disease diagnosis, demonstrating its ability to integrate multi-modal patient data and enhance predictive accuracy in cases limited to a single domain by leveraging insights from the multi-modal problem.
Manifold learning encompasses a set of methods used to create a lower-dimensional representation, or an embedding, of higher-dimensional data. Such representations can play a key role in data visualization [1]-[5] and in dimensionality reduction as a preprocessing step for subsequent machine-learning or analytical tasks [6], or serve as a denoising mechanism [4]. In the context of multi-domain problems, where multiple types of data are considered, manifold learning becomes more challenging, as data distributions across different domains or modalities may exhibit domain-specific variations while still sharing a common geometric structure. Manifold alignment (MA) seeks to address this problem. In some contexts, a common, shared representation of multi-modal data can be viewed as a natural extension of manifold learning.
For example, cell samples of the same type that were collected at different times or with different methodologies should still share common features, but batch effects [7] may introduce differences in the measured features that obscure these similarities.
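One way to picture the geometry-regularized twin autoencoder idea described above is as a per-domain loss with two terms: a reconstruction error plus a penalty that pulls the latent code toward an embedding supplied by a pre-trained alignment method. The sketch below is only an illustration under that assumption; the names `guided_ae_loss`, `cross_domain_map`, the weight `lam`, and the toy linear encoders and decoders are hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def guided_ae_loss(x: Vector, x_hat: Vector, z: Vector,
                   z_target: Vector, lam: float = 1.0) -> float:
    """Reconstruction error plus a geometric penalty (assumed form).

    z_target stands in for the embedding of the same point produced by a
    pre-trained alignment method; lam weights the geometric term.
    """
    return mse(x, x_hat) + lam * mse(z, z_target)


def cross_domain_map(encode_a: Callable[[Vector], Vector],
                     decode_b: Callable[[Vector], Vector],
                     x_a: Vector) -> Vector:
    """Map a domain-A point to domain B through the shared latent space."""
    return decode_b(encode_a(x_a))


# Toy linear "networks" (hypothetical stand-ins for trained encoders/decoders).
encode_a = lambda x: [x[0] + x[1], x[0] - x[1]]
decode_a = lambda z: [(z[0] + z[1]) / 2, (z[0] - z[1]) / 2]
decode_b = lambda z: [2.0 * z[0], 2.0 * z[1], z[0] + z[1]]

x_a = [1.0, 3.0]
z_a = encode_a(x_a)        # latent code of the domain-A point
z_target = [4.0, -1.0]     # made-up aligned embedding for the same point

loss_a = guided_ae_loss(x_a, decode_a(z_a), z_a, z_target, lam=0.5)
x_b_pred = cross_domain_map(encode_a, decode_b, x_a)
```

In this toy setup the reconstruction is exact, so the loss comes entirely from the geometric term. Because both encoders are pulled toward the same aligned embedding, a new point can be encoded in one domain and decoded in the other, which is the kind of out-of-sample extension the abstract refers to.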
A Nurse is Blue and Elephant is Rugby: Cross Domain Alignment in Large Language Models Reveal Human-like Patterns
Yehudai, Asaf, Karidi, Taelin, Stanovsky, Gabriel, Goldstein, Ariel, Abend, Omri
Cross-domain alignment refers to the task of mapping a concept from one domain to another. For example, "If a doctor were a color, what color would it be?" This seemingly peculiar task is designed to investigate how people represent concrete and abstract concepts through their mappings between categories and their reasoning processes over those mappings. In this paper, we adapt this task from cognitive science to evaluate the conceptualization and reasoning abilities of large language models (LLMs) through a behavioral study. We examine several LLMs by prompting them with a cross-domain mapping task and analyzing their responses at both the population and individual levels. Additionally, we assess the models' ability to reason about their predictions by analyzing and categorizing their explanations for these mappings. The results reveal several similarities between humans' and models' mappings and explanations, suggesting that models represent concepts similarly to humans. This similarity is evident not only in the model representation but also in their behavior. Furthermore, the models mostly provide valid explanations and deploy reasoning paths that are similar to those of humans.
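As a rough illustration of the prompting setup described above, the sketch below builds the question for a concept and a target domain, then tallies a set of answers to find the most common mapping at the population level. The helper names `build_prompt` and `population_mapping` are hypothetical, and the responses are made-up placeholders, not results from the paper.

```python
from collections import Counter
from typing import Iterable


def with_article(noun: str) -> str:
    """Prefix a noun with 'a' or 'an' (simple vowel heuristic)."""
    article = "an" if noun[0].lower() in "aeiou" else "a"
    return f"{article} {noun}"


def build_prompt(concept: str, target_domain: str) -> str:
    """Build the cross-domain question for one concept and target domain."""
    return (f"If {with_article(concept)} were {with_article(target_domain)}, "
            f"what {target_domain} would it be?")


def population_mapping(responses: Iterable[str]) -> str:
    """Return the most frequent answer after normalizing case and spacing."""
    counts = Counter(r.strip().lower() for r in responses)
    return counts.most_common(1)[0][0]


prompt = build_prompt("doctor", "color")

# Placeholder answers for illustration only; not data from the study.
hypothetical_responses = ["White", "white ", "blue", "White"]
top_answer = population_mapping(hypothetical_responses)
```

An individual-level analysis would instead compare each respondent's answers with the population-level mapping, which this sketch does not attempt.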
Non-Adversarial Mapping with VAEs
The study of cross-domain mapping without supervision has recently attracted much attention. Much of the recent progress was enabled by the use of adversarial training as well as cycle constraints. In a recent paper, it was shown that cross-domain mapping is possible without the use of cycles or GANs. Although promising, this approach suffers from several drawbacks, including costly inference and the need for an optimization variable for every training example, which prevents the method from using large training sets. We present an alternative approach that achieves non-adversarial mapping using a novel form of Variational Auto-Encoder.